Add 'make test' target to build and run correctness tests #19

Open
v1kko wants to merge 2 commits into treecode:master from v1kko:add-make-test

v1kko commented Jun 1, 2026

Add a top-level test target that builds the library and the test programs against it, then runs the GPU-vs-CPU correctness tests for the 4th-order, GRAPE5 (2nd-order) and 6th-order integrators. The target dispatches to the CUDA or OpenCL test Makefile based on the selected BACKEND, and sets LD_LIBRARY_PATH so the freshly built shared libraries are found at runtime. A build-tests target is also provided to compile the tests without running them, and clean now also cleans the tests directory.

Fix two pre-existing bugs in tests/Makefile_ocl that prevented the OpenCL tests from building/running: link against libsapporo2 (not the old libsapporo name) and symlink kernels from src/OpenCL.
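
The dispatch and library-path handling described above can be sketched in shell. This is only an illustration under assumptions, not the recipe from this PR: `tests/Makefile_ocl` and `BACKEND` come from the description, while the CUDA test Makefile name (`Makefile`), the `lib` directory and both function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: pick the test Makefile for a backend.
# "Makefile_ocl" is named in the PR; "Makefile" for CUDA is an assumption.
select_test_makefile() {
    case "$1" in
        CUDA)   echo "Makefile" ;;
        OpenCL) echo "Makefile_ocl" ;;
        *)      echo "unknown BACKEND: $1" >&2; return 1 ;;
    esac
}

# Prepend a directory to LD_LIBRARY_PATH without leaving a trailing
# colon when the variable is empty or unset.
prepend_lib_path() {
    echo "$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
}

# Example use (the "lib" directory is hypothetical):
#   mk=$(select_test_makefile "$BACKEND") || exit 1
#   LD_LIBRARY_PATH=$(prepend_lib_path "$PWD/lib") make -C tests -f "$mk"
```

Avoiding the trailing colon matters because the dynamic loader treats an empty entry in `LD_LIBRARY_PATH` as the current directory.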

v1kko and others added 2 commits June 1, 2026 09:32
Add a top-level `test` target that builds the library and the test
programs against it, then runs the GPU-vs-CPU correctness tests for the
4th-order, GRAPE5 (2nd-order) and 6th-order integrators. The target
dispatches to the CUDA or OpenCL test Makefile based on the selected
BACKEND, and sets LD_LIBRARY_PATH so the freshly built shared libraries
are found at runtime. A `build-tests` target is also provided to compile
the tests without running them, and `clean` now also cleans the tests
directory.

Fix two pre-existing bugs in tests/Makefile_ocl that prevented the
OpenCL tests from building/running: link against libsapporo2 (not the
old libsapporo name) and symlink kernels from src/OpenCL.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

The objects and libsapporo2 are backend-specific (CUDA compiles cudadev.h,
OpenCL compiles ocldev.h via -D_OCL_), but Make only tracks file timestamps,
not the value of BACKEND. After building one backend, a subsequent build with
the other silently reused the existing objects, so e.g. an OpenCL test linked
against a CUDA-compiled libsapporo2 and aborted with CUDA_ERROR_INVALID_IMAGE
in cudadev.h when cuModuleLoad() was handed a .cl source file.

Record the active backend in a stamp file that the objects depend on, forcing
a recompile when BACKEND changes while keeping same-backend rebuilds
incremental.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
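
The stamp-file idea from the second commit message can be sketched as a small shell helper. This is an assumption-laden illustration, not the code from the commit: the stamp file name and the function name are hypothetical. The point is that the file is rewritten only when the recorded backend differs, so its timestamp, which is all Make looks at, moves only on a real backend switch.

```shell
#!/bin/sh
# Sketch only: record the active backend in a stamp file.
#   $1 = backend name, $2 = stamp file (hypothetical default name)
update_backend_stamp() {
    backend=$1
    stamp=${2:-.backend-stamp}
    # Stamp exists and already records this backend: leave it untouched,
    # so objects that depend on it are not rebuilt.
    if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$backend" ]; then
        echo "unchanged"
    else
        # First build or backend switch: rewriting the file updates its
        # timestamp, which makes dependent objects out of date.
        printf '%s\n' "$backend" > "$stamp"
        echo "changed"
    fi
}
```

In a Makefile, the object files would list the stamp as a prerequisite, and a rule that always runs would call a helper like this before the objects are considered.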

HannoSpreeuw commented Jun 1, 2026

Collaborator

A progress report on this.

With commit c1292cb, both make all BACKEND=CUDA and make test BACKEND=CUDA run to completion.
This is on node021 of DAS6, after module load cuda12.6/toolkit/12.6.

make all BACKEND=OpenCL also completes without error after module load opencl-nvidia/12.3.

make test BACKEND=OpenCL fails with

Cuda driver error <CUDA_ERROR_INVALID_IMAGE> in file 'src/cudadev.h' in line 763.
test_gravity_block_ocl: src/cudadev.h:763: void dev::kernel::load_source(const char*, const char*, const char*): Assertion `false' failed.
/bin/sh: line 1: 1803410 Aborted                 (core dumped)

when running test_gravity_block_ocl, i.e. after building the eight executables in the tests folder.

Commit 2bcbd49 does not fix that error, so I reverted it (locally).

@HannoSpreeuw

Collaborator

On a different machine with an NVIDIA GPU and CUDA 13.1, after branching off master and running git cherry-pick c1292cb, the error from make test BACKEND=OpenCL is slightly different:

=== Running test_gravity_block_ocl ===
 n = 1024
./test_gravity_block_ocl is using: n: 1024	Devices: 0	IntegrationOrder: 1	IntegrationPrecision: 1	File: OpenCL/kernels4th.cl
Integration order used: 1 (0=GRAPE5, 1=4th, 2=6th, 3=8th)
Integration precision used: 1 (0=FLOAT, 1 = DOUBLESINGLE, 2=DOUBLE)
Getting list of OpenCL devices ...
oclSafeCall() Runtime API error in file <src/ocldev.h>, line 251 : Unknown OpenCL error
.
test_gravity_block_ocl: src/ocldev.h:179: void dev::__oclsafeCall(cl_int, const char*, int): Assertion `false' failed.
make: *** [Makefile:241: test] Error 1

@HannoSpreeuw

Collaborator

My AI tool insists that this error occurs because the OpenCL packages have not been installed completely and/or my environment is not set up to point to the OpenCL libraries.

Indeed, I could bypass that error with conda install pocl. After that, make test BACKEND=OpenCL yields a (slightly) different error:

=== Running test_gravity_block_ocl ===
 n = 1024
./test_gravity_block_ocl is using: n: 1024	Devices: 0	IntegrationOrder: 1	IntegrationPrecision: 1	File: OpenCL/kernels4th.cl
Integration order used: 1 (0=GRAPE5, 1=4th, 2=6th, 3=8th)
Integration precision used: 1 (0=FLOAT, 1 = DOUBLESINGLE, 2=DOUBLE)
Getting list of OpenCL devices ...
 0: Portable Computing Language
Using platform 0 
oclSafeCall() Runtime API error in file <src/ocldev.h>, line 263 : Device not found.
.
test_gravity_block_ocl: src/ocldev.h:179: void dev::__oclsafeCall(cl_int, const char*, int): Assertion `false' failed.
make: *** [Makefile:241: test] Error 1

v1kko commented Jun 2, 2026

Author

Right, Claude fixed that one by temporarily patching the code to set the OpenCL target to CPU. It is hardcoded to GPU somewhere, and pocl is a CPU library.

@HannoSpreeuw

Collaborator

I also added pocl through my OS package manager.
Fresh start: make clean.
make all BACKEND=OpenCL; make test BACKEND=OpenCL now yields another slightly different error:

=== Running test_gravity_block_ocl ===
 n = 1024
./test_gravity_block_ocl is using: n: 1024	Devices: 0	IntegrationOrder: 1	IntegrationPrecision: 1	File: OpenCL/kernels4th.cl
Integration order used: 1 (0=GRAPE5, 1=4th, 2=6th, 3=8th)
Integration precision used: 1 (0=FLOAT, 1 = DOUBLESINGLE, 2=DOUBLE)
Getting list of OpenCL devices ...
 0: Portable Computing Language
Using platform 0 
No GPU devices found on platform, trying CPU devices...
Found 1 suitable devices: 
 0: pthread-AMD Ryzen 9 7900X 12-Core Processor	Vendor: AuthenticAMD
Number of cpus available: 24
Number of gpus available: 1
integrationOrder : 1
Getting list of OpenCL devices ...
 0: Portable Computing Language
Using platform 0 
No GPU devices found on platform, trying CPU devices...
Found 1 suitable devices: 
 0: pthread-AMD Ryzen 9 7900X 12-Core Processor	Vendor: AuthenticAMD
Using device: 0
Device has: 24 	 multiprocessors 
Using  2 blocks per multi-processor for a total of : 48
Loading file:  OpenCL/kernels4th.cl 
Opening kernel file: OpenCL/kernels4th.cl
Compilation of the source file failed: OpenCL/kernels4th.cl 
Compiler output: 
 error: unknown target CPU 'generic'
 
test_gravity_block_ocl: src/ocldev.h:660: void dev::kernel::print_compiler_output(): Assertion `false' failed.
make: *** [Makefile:241: test] Error 1

My AI tool (GH Copilot) suggests that I should install an OpenCL runtime for my hardware (AMD CPU, NVIDIA GPU).

HannoSpreeuw commented Jun 2, 2026

Collaborator

With a fresh Python 3.14 Conda env and everything fresh: make clean; make all BACKEND=OpenCL; make test BACKEND=OpenCL:

=== Running test_gravity_block_ocl ===
 n = 1024
./test_gravity_block_ocl is using: n: 1024	Devices: 0	IntegrationOrder: 1	IntegrationPrecision: 1	File: OpenCL/kernels4th.cl
Integration order used: 1 (0=GRAPE5, 1=4th, 2=6th, 3=8th)
Integration precision used: 1 (0=FLOAT, 1 = DOUBLESINGLE, 2=DOUBLE)
Getting list of OpenCL devices ...
 0: AMD Accelerated Parallel Processing
 1: NVIDIA CUDA
Using platform 0 
Found 1 suitable devices: 
 0: gfx1036	Vendor: Advanced Micro Devices, Inc.
Number of cpus available: 24
Number of gpus available: 1
integrationOrder : 1
Getting list of OpenCL devices ...
 0: AMD Accelerated Parallel Processing
 1: NVIDIA CUDA
Using platform 0 
Found 1 suitable devices: 
 0: gfx1036	Vendor: Advanced Micro Devices, Inc.
Using device: 0
Device has: 1 	 multiprocessors 
Using  2 blocks per multi-processor for a total of : 2
Loading file:  OpenCL/kernels4th.cl 
Opening kernel file: OpenCL/kernels4th.cl
Loading file:  OpenCL/kernels4th.cl 
Opening kernel file: OpenCL/kernels4th.cl
Loading file:  OpenCL/kernels4th.cl 
Opening kernel file: OpenCL/kernels4th.cl
Loading file:  OpenCL/kernels4th.cl 
Opening kernel file: OpenCL/kernels4th.cl
Kernel files found .. building compute kernels! 
Creating kernel dev_copy_particles 
Maximum work group size: 256 Optimal work group multiple: 32 
Creating kernel dev_predictor 
Maximum work group size: 256 Optimal work group multiple: 32 
Creating kernel dev_evaluate_gravity_fourth_DS 
Maximum work group size: 256 Optimal work group multiple: 32 
Creating kernel dev_reset_buffers 
Maximum work group size: 256 Optimal work group multiple: 32 
oclSafeCall() Runtime API error in file <src/ocldev.h>, line 881 : Invalid work group size
. Kernel name: dev_evaluate_gravity_fourth_DS
test_gravity_block_ocl: src/ocldev.h:187: void dev::__oclsafeCallKernel(cl_int, const char*, const char*, int): Assertion `false' failed.
make: *** [Makefile:241: test] Error 1

Invalid work group size is something that has to do with the code, not with my env.
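
The thread does not establish which launch parameter triggers this error. As a rough aid, the sketch below checks a one-dimensional launch against two common OpenCL 1.x causes of "Invalid work group size": a local size above the reported maximum (256 in the log above), and a global size that is not a multiple of the local size. The function names and example values are hypothetical and are not taken from the sapporo2 sources.

```shell
#!/bin/sh
# Sketch only: sanity-check a 1-D launch configuration.
#   $1 = global work size, $2 = local work size, $3 = device/kernel maximum
check_ndrange() {
    global=$1; local_size=$2; max=$3
    if [ "$local_size" -gt "$max" ]; then
        echo "local size exceeds maximum"; return 1
    fi
    if [ $((global % local_size)) -ne 0 ]; then
        echo "global size not a multiple of local size"; return 1
    fi
    echo "ok"
}

# Round a global size ($1) up to the next multiple of the local size ($2).
round_up() {
    echo $(( ($1 + $2 - 1) / $2 * $2 ))
}

# Example use with hypothetical values:
#   check_ndrange "$(round_up 1000 256)" 256 256
```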

And it is using platform 0 (CPU) instead of my GPU.

HannoSpreeuw commented Jun 2, 2026

Collaborator

> And it is using platform 0 (CPU) instead of my GPU.

I got that fixed: it now uses my GPU and runs at 100% utilisation.

Unfortunately, the test is still running, much longer than with the CUDA backend (2 s or so).
